Safety Ring: Fault-tolerant Distributed Process Execution in OSIRIS

Authors

  • Nenad Stojnić
  • Heiko Schuldt
Abstract

The advent of service-oriented architectures (SOAs) has strongly facilitated the development and deployment of large-scale distributed (service-oriented) applications. The middleware for orchestrating process-based applications that consist of several distributed services has to be inherently distributed as well, in order to provide a high degree of scalability and to avoid a single point of failure. Self-healing execution of such processes supported by a distributed middleware requires replicated control metadata and instance data of processes. Most importantly, replication has to be provided in a way that does not affect the adaptivity and elasticity behavior of the middleware for composite service execution. In this technical report, we introduce OSIRIS Safety Ring, a novel approach to fault-tolerant process execution. Safety Ring is based on OSIRIS, a distributed and decentralized middleware for the execution of composite services. Essentially, the Safety Ring exploits dedicated node monitors, organized in a self-organizing ring structure, for the replication of control data. Moreover, it leverages virtual stable storage for managing process instance data in a robust way. We present the architecture of OSIRIS’ Safety Ring and discuss in detail the algorithms it applies for self-healing process execution. The performance evaluation shows that the additional gain in robustness has only marginal effects on the scalability characteristics of the system.

Similar resources

Modeling Fault-Tolerant and Reliable Mobile Agent Execution in Distributed Systems

The reliable execution of a mobile agent is a very important design issue in building a mobile agent system and many fault-tolerant schemes have been proposed so far. To further develop mobile agent technology, reliability mechanisms such as fault tolerance and transaction support are required. For this purpose, we first identify two basic requirements for fault-tolerant mobile agent execution:...

Centralized Failure Injection for Distributed, Fault-Tolerant Protocol Testing

We describe a centralized approach to testing that distributed fault-tolerant protocols satisfy their safety and timeliness specifications in the presence of the very failures they are designed to tolerate. Cesium is a testing environment based on the centralized simulation of distributed executions and failures. Processes are run in a single address space while providing the appearance of a tru...

Preserving the Fault-Containment of Ring Protocols Executed on Trees

Reliable and fault-tolerant distributed systems have been attracting more and more attention (see Autonomic Computing Project by IBM, http://www-03.ibm.com/autonomic/). A self-stabilizing protocol is a fault-tolerant protocol that guarantees autonomous recovery from any number of and any type of faults that can affect the data stored locally at some process(es). If the impact of the faults can ...

Voting Algorithm Based on Adaptive Neuro Fuzzy Inference System for Fault Tolerant Systems

Some applications are critical and must be designed as fault-tolerant systems. Usually, a voting algorithm is one of the principal elements of a fault-tolerant system. Two kinds of voting algorithm are used in most applications: the majority voting algorithm and the weighted average algorithm. These algorithms have some problems. Majority confronts with the problem of threshold limits and voter of weight...


Publication date: 2012